
    An Autoencoder-Based Image Descriptor for Image Matching and Retrieval

    Local image features are used in many computer vision applications. Many point detectors and descriptors have been proposed in recent years; however, the creation of effective descriptors remains an open research topic. The Scale Invariant Feature Transform (SIFT) developed by David Lowe is widely used in image matching and image retrieval. SIFT detects interest points in an image through scale-space analysis, which makes it invariant to changes in image scale. A SIFT descriptor encodes gradient information about an image patch centered at a point of interest. SIFT provides a high matching rate and is robust to image transformations; however, it is slow in image matching and retrieval. An autoencoder is a representation-learning method, used in this project to construct a low-dimensional representation of high-dimensional data while preserving the data's structure and geometry. In many computer vision tasks, the high dimensionality of the input data implies a high computational cost. The main motivation of this project is to improve the speed and distinctiveness of SIFT descriptors. To achieve this, a new autoencoder-based descriptor is proposed. The newly generated descriptors reduce the size and complexity of SIFT descriptors, reducing the time required for image matching and image retrieval.
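    To make the compression idea concrete, here is a minimal sketch (not the authors' code) of an autoencoder that maps 128-dimensional SIFT descriptors to a compact code used for matching; the 32-dimensional bottleneck, the layer sizes, and the MSE reconstruction objective are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SIFTAutoencoder(nn.Module):
    """Compresses 128-d SIFT descriptors to a low-dimensional code."""
    def __init__(self, in_dim=128, code_dim=32):  # code_dim is an assumption
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, code_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 64), nn.ReLU(),
            nn.Linear(64, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)           # compact descriptor used for matching
        return self.decoder(code), code

model = SIFTAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sift_batch = torch.rand(256, 128)        # stand-in for real SIFT descriptors
recon, code = model(sift_batch)
loss = nn.functional.mse_loss(recon, sift_batch)  # reconstruction objective
loss.backward()
opt.step()
```

    Matching then runs nearest-neighbour search on the short codes instead of the full 128-d descriptors, which is where the speed-up would come from.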

    Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation

    In reinforcement learning, domain randomisation is an increasingly popular technique for learning more general policies that are robust to domain shifts at deployment. However, naively aggregating information from randomised domains may lead to high variance in gradient estimation and an unstable learning process. To address this issue, we present a peer-to-peer online distillation strategy for RL termed P2PDRL, where multiple workers are each assigned to a different environment and exchange knowledge through mutual regularisation based on Kullback-Leibler (KL) divergence. Our experiments on continuous control tasks show that P2PDRL enables robust learning across a wider randomisation distribution than baselines, and more robust generalisation to new environments at test time.
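    The mutual-regularisation term can be sketched as follows. This illustrates the idea rather than the paper's implementation: the discrete-action form, the KL direction, and treating peers as fixed teachers via detach() are all simplifying assumptions (the paper works with continuous control).

```python
import torch
import torch.nn.functional as F

def peer_kl_loss(logits_i, peer_logits):
    """Mean KL(pi_i || pi_j) from worker i to each peer j on a shared state batch.

    logits_i:    [B, A] action logits of worker i
    peer_logits: list of [B, A] logits from the other workers
    """
    log_p_i = F.log_softmax(logits_i, dim=-1)
    loss = 0.0
    for logits_j in peer_logits:
        log_p_j = F.log_softmax(logits_j, dim=-1).detach()  # peers as fixed teachers
        loss = loss + (log_p_i.exp() * (log_p_i - log_p_j)).sum(-1).mean()
    return loss / len(peer_logits)

# Each worker would then optimise something like
#   total_loss = rl_loss_on_own_domain + beta * peer_kl_loss(logits_i, peer_logits)
# where beta is a hypothetical regularisation weight.
```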

    ODAM: Gradient-based instance-specific visual explanations for object detection

    We propose gradient-weighted Object Detector Activation Maps (ODAM), a visual explanation technique for interpreting the predictions of object detectors. Using the gradients of detector targets flowing into the intermediate feature maps, ODAM produces heat maps that show the influence of image regions on the detector's decision for each predicted attribute. In contrast to previous work on class activation maps (CAM), ODAM generates instance-specific explanations rather than class-specific ones. We show that ODAM is applicable to both one-stage and two-stage detectors with different types of backbones and heads, and produces higher-quality visual explanations than the state of the art, both effectively and efficiently. We then propose a training scheme, Odam-Train, which improves the detector's ability to discriminate objects in its explanations by encouraging consistency between explanations for detections on the same object and distinct explanations for detections on different objects. Based on the heat maps produced by ODAM with Odam-Train, we propose Odam-NMS, which uses the model's explanation for each prediction to distinguish duplicate detections of the same object. We present a detailed analysis of the visualized explanations of detectors and carry out extensive experiments to validate the effectiveness of the proposed ODAM.
    Comment: 2023 International Conference on Learning Representations
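    As a rough illustration of what "gradient-weighted" means here, the sketch below weights an intermediate feature map element-wise by the gradient of a single detection's predicted attribute and sums over channels; the element-wise weighting and the ReLU are assumptions based on the abstract, not the paper's exact formulation.

```python
import torch

def instance_heatmap(feature_map, score):
    """feature_map: [C, H, W] intermediate features with requires_grad=True;
    score: scalar prediction of one detection (e.g. its class score)."""
    grads, = torch.autograd.grad(score, feature_map, retain_graph=True)
    heat = torch.relu((grads * feature_map).sum(dim=0))  # channel-summed saliency
    return heat / (heat.max() + 1e-8)                    # normalise for display

feat = torch.rand(256, 32, 32, requires_grad=True)  # stand-in feature map
score = feat.mean()                                  # stand-in detection score
print(instance_heatmap(feat, score).shape)           # torch.Size([32, 32])
```

    Because the gradient is taken per detection rather than per class, two detections of the same class can yield different heat maps, which is what makes the explanation instance-specific.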

    Deep Reinforcement Learning for Resource Management in Network Slicing

    Network slicing has emerged as a new business opportunity for operators, allowing them to sell customized slices to various tenants at different prices. In order to provide better-performing and cost-efficient services, network slicing involves challenging technical issues and urgently calls for intelligent innovations that make resource management consistent with users' activities per slice. In that regard, deep reinforcement learning (DRL), which learns by interacting with the environment, trying alternative actions and reinforcing those that produce more rewarding consequences, is a promising solution. In this paper, after briefly reviewing the fundamental concepts of DRL, we investigate its application to typical resource-management problems in network slicing, including radio resource slicing and priority-based core network slicing, and demonstrate the advantage of DRL over several competing schemes through extensive simulations. Finally, we discuss the challenges of applying DRL to network slicing from a general perspective.
    Comment: The manuscript has been accepted by IEEE Access in Nov. 201
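    To ground the setup, here is a hedged sketch of the kind of DQN loop such a study uses for radio resource slicing; the per-slice demand state, the discrete set of candidate bandwidth splits, and the random placeholder reward are all illustrative assumptions, not the paper's formulation.

```python
import random
import torch
import torch.nn as nn

n_slices, n_actions = 3, 10    # assumed: 3 slices, 10 candidate bandwidth splits
q_net = nn.Sequential(nn.Linear(n_slices, 64), nn.ReLU(), nn.Linear(64, n_actions))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, eps = 0.99, 0.1

def env_step(state, action):
    """Placeholder environment: returns new per-slice demand and a reward."""
    next_state = torch.rand(n_slices)  # stand-in for observed traffic demand
    reward = random.random()           # stand-in for a utility/QoE-based reward
    return next_state, reward

state = torch.rand(n_slices)
for t in range(100):
    with torch.no_grad():
        greedy = int(q_net(state).argmax())
    action = random.randrange(n_actions) if random.random() < eps else greedy
    next_state, reward = env_step(state, action)
    with torch.no_grad():
        target = reward + gamma * q_net(next_state).max()
    loss = (q_net(state)[action] - target) ** 2  # one-step TD error
    opt.zero_grad(); loss.backward(); opt.step()
    state = next_state
```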

    Understanding the Performance of Learning Precoding Policy with GNN and CNNs

    Learning-based precoding has been shown to be implementable in real time, jointly optimizable with channel acquisition, and robust to imperfect channels. Yet previous works rarely explain the design choices and learning performance, and existing methods either suffer from high training complexity or depend on problem-specific models. In this paper, we address these issues by analyzing the properties of the precoding policy and the inductive biases of neural networks, noting that learning performance can be decomposed into approximation and estimation errors, where the former is related to the smoothness of the policy and both depend on the inductive biases of the neural network. To this end, we introduce a graph neural network (GNN) to learn the precoding policy and analyze its connection with the commonly used convolutional neural networks (CNNs). Taking a sum-rate-maximizing precoding policy as an example, we explain why the learned policy performs well in the low signal-to-noise-ratio regime, in spatially uncorrelated channels, and when the number of users is much smaller than the number of antennas, as well as why the GNN has higher learning efficiency than CNNs. Extensive simulations validate our analyses and evaluate the generalization ability of the GNN.
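    The inductive bias being exploited is permutation equivariance over users and antennas. A minimal sketch of one such GNN layer is below; the mean aggregator, the hidden sizes, and the [users x antennas x features] layout are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PrecodingGNNLayer(nn.Module):
    """Updates a per-(user, antenna) hidden state from the same user's other
    antennas and the same antenna's other users, so the layer is equivariant
    to permutations of both users and antennas."""
    def __init__(self, d):
        super().__init__()
        self.w_self = nn.Linear(d, d)
        self.w_user = nn.Linear(d, d)  # aggregation across a user's antennas
        self.w_ant = nn.Linear(d, d)   # aggregation across an antenna's users

    def forward(self, h):              # h: [K, N, d] = users x antennas x features
        agg_user = h.mean(dim=1, keepdim=True).expand_as(h)
        agg_ant = h.mean(dim=0, keepdim=True).expand_as(h)
        return torch.relu(self.w_self(h) + self.w_user(agg_user) + self.w_ant(agg_ant))

layer = PrecodingGNNLayer(d=16)
h = torch.rand(4, 8, 16)               # 4 users, 8 antennas, 16 features
print(layer(h).shape)                  # torch.Size([4, 8, 16])
```

    A CNN can express a similar computation but does not share parameters across users and antennas in the same way, which is one intuition for the GNN's higher learning efficiency.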

    Demand Forecast in Retail Assortment Optimization—Based on an Empirical Analysis of Beverage Sales

    This paper focuses on establishing a demand forecasting model to optimize product assortments chosen from a set of SKUs in the same category, with the aim of maximizing revenue. Working at the attribute level, the demand model accounts for consumers' preferences and the possibility of substitution between different attributes; it decomposes each product into its specific attributes and multiplies the corresponding attribute effects. The model is then applied to a beverage case for empirical analysis: top beverage categories were selected, and e-commerce sales data were collected to represent the pre-sales of the whole categories. Finally, a hypothetical store S stocking a set of beverage SKUs is fed to the model, which predicts the sales volume of each existing SKU and the total revenue.
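    The multiplicative attribute structure the abstract describes can be sketched in a few lines; the attribute names, effect values, and baseline below are made-up illustrations, not estimates from the paper.

```python
# Assumed baseline sales rate for the category (illustrative number)
base_demand = 100.0

# Hypothetical multiplicative effect of each attribute level
attribute_effects = {
    ("brand", "A"): 1.2, ("brand", "B"): 0.9,
    ("flavor", "cola"): 1.5, ("flavor", "lemon"): 0.8,
    ("size", "330ml"): 1.0, ("size", "500ml"): 1.1,
}

def forecast(sku_attributes):
    """Demand = base rate times the product of the SKU's attribute effects."""
    demand = base_demand
    for attr in sku_attributes.items():
        demand *= attribute_effects[attr]
    return demand

sku = {"brand": "A", "flavor": "cola", "size": "500ml"}
print(round(forecast(sku), 2))  # 100 * 1.2 * 1.5 * 1.1 = 198.0

# Assortment revenue would then sum price * forecast(sku) over the chosen SKUs.
```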

    Generalisation in deep reinforcement learning with multiple tasks and domains

    A long-standing vision of robotics research is to build autonomous systems that can adapt to unforeseen environmental perturbations and learn a set of tasks progressively. Reinforcement learning (RL) has shown great success in a variety of robot control tasks thanks to recent advances in hardware and learning techniques. To further this long-term goal, generalisation in RL arises as a demanding research topic, as it allows learning agents to extract knowledge from past experience and transfer it to new situations. This covers generalisation against sampling noise to avoid overfitting, generalisation against environmental changes to avoid domain shift, and generalisation over different but related tasks to achieve lifelong knowledge transfer. This thesis investigates these challenges in the context of RL, with a main focus on cross-domain and cross-task generalisation.

    We first address the problem of generalisation across domains. Focusing on continuous control tasks, we characterise the sources of uncertainty that may cause generalisation challenges in deep RL, and provide a new benchmark and a thorough empirical evaluation of generalisation challenges for state-of-the-art deep RL methods. In particular, we show that, if generalisation is the goal, the common practice of evaluating algorithms based on their training performance leads to the wrong conclusions about algorithm choice. Moreover, we evaluate several techniques for improving generalisation and draw conclusions about the most robust techniques to date. The evaluation shows that learning from multiple domains improves generalisation across domains; however, aggregating gradient information from different domains may make learning unstable.

    In the second piece of work, we therefore propose to update the policy at every iteration so as to minimise the sum of its distances to the new policies learned in each domain, measured by the Kullback-Leibler (KL) divergence of the output (action) distributions. We show that our method improves both the asymptotic training reward and the policy's robustness to domain shifts at test time in a variety of control tasks.

    We finally investigate generalisation across different classes of control tasks. In particular, we introduce a class of neural network controllers that can realise four distinct tasks: reaching, object throwing, casting, and ball-in-cup. By factorising the weights of the neural network, transferable latent skills are extracted, which accelerate learning in cross-task transfer. With a suitable curriculum, this allows us to learn challenging dexterous control tasks like ball-in-cup from scratch with only reinforcement learning.
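    The weight-factorisation idea in the final part can be sketched as a layer whose weights are a task-specific mixture of shared basis matrices; the layer sizes, the number of basis components, and the mixture parameterisation below are illustrative assumptions, not the thesis's architecture.

```python
import torch
import torch.nn as nn

class FactoredLinear(nn.Module):
    """Linear layer with weights W(task) = sum_k s[task, k] * B[k]:
    the basis B encodes shared latent skills, while the mixing
    coefficients s are the small task-specific part."""
    def __init__(self, in_dim, out_dim, n_tasks, n_basis=4):
        super().__init__()
        self.basis = nn.Parameter(0.05 * torch.randn(n_basis, out_dim, in_dim))
        self.mix = nn.Parameter(torch.randn(n_tasks, n_basis))

    def forward(self, x, task_id):
        w = torch.einsum("k,koi->oi", self.mix[task_id], self.basis)
        return x @ w.T

layer = FactoredLinear(in_dim=10, out_dim=6, n_tasks=4)  # e.g. 4 control tasks
out = layer(torch.rand(2, 10), task_id=0)                # per-task controller
print(out.shape)                                         # torch.Size([2, 6])
```

    Under this factorisation, transferring to a new task can amount to learning fresh mixing coefficients over a frozen basis, which is one way such a scheme can accelerate cross-task learning.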